ABSTRACT

This paper describes CAPE, a programming environment
that combines CLIPS And Perl with Extensions. CLIPS is
an efficient and expressive forward-chaining rule-based
system with a flexible object system (supporting both
message passing and generic functions). Perl is a popular
procedural language with extremely powerful regular
expression matching facilities, and a huge library of freely
available software modules. CAPE closely integrates these
two programming languages, and provides extensions to
facilitate building systems with an intimate mixture of the
two languages. These features make CAPE an excellent
language for building knowledge-based systems to exploit
the opportunities being presented by the Internet.

This paper describes the current version of CAPE and the 
facilities it offers programmers, including the demonstration 
systems and "component applications" that are distributed 
with it. The use of the system is then discussed with refer- 
ence to an application for automatically generating graphs 
of remote web sites. Finally, planned developments of the 
system are indicated. 

General Terms 

Rule-based Programming, Knowledge-Based System Tools,
CLIPS, Perl

1. BACKGROUND 

Conventional Knowledge Based Systems (KBSs) involve the
controlled manipulation of symbolic descriptions of the world.
Extracting such symbolic descriptions from sensory data was
(and still is) a major challenge in its own right, addressed
by specific fields such as speech recognition and vision, and
most work in KBSs tries to sidestep it.
At first, KBSs worked with very restricted amounts of
information (e.g. chess positions), typically solving problems
described in special-format data files. They began to
achieve much wider use and acceptance when Expert Systems
began to ask questions of users ("Has the patient got a
rash?"), thereby using the user's perceptual abilities and
common sense to analyse the world into the abstract terms
that the KBS required. Subsequently, KBSs began to access
data, either statically (i.e. a KBS linked to a database) or by
analysing real-time feeds (e.g. network monitoring). To a
greater or lesser extent, these systems can determine what
data is relevant, and when, but only within a fixed (or at
least limited) range of data for which the format and
semantics were known to the system builders (or maintainers).
Now, we have the Internet.

By providing "universal" connectivity between computers, 
the Internet has created a previously inconceivable range of 
possibilities for any program. Innumerable documents — 
ranging from the ephemera of electronic "news" discussions 
through to published documents (including newspapers, lit- 
erature and technical documents) — can be retrieved from 
almost anywhere in the world within moments. Programs 
can instantly query a multitude of databases, covering ev- 
erything from holiday prices to registered trade marks, and 
call on other computational services such as dynamic infor- 
mation feeds (e.g. stock prices), notification of changes and 
so forth. Finally, the Internet allows programs to easily in- 
teract with huge numbers of people, either by email or using 
Web-based forms.

The number of sources of information and services continues
to grow at an incomprehensible pace. Crucially, though,
those sources are outside the control of the system's
builders. Indeed, since they are continually growing and
evolving, they are outside the builders' knowledge as well.

Fully exploiting the opportunity presented by the Internet
(e.g. by building systems which can analyse and filter
information for a particular purpose) will in several ways
require a new breed of KBS. Such systems will, at the very
least, have to be easily configured to deal with new sources
of information, and should, more importantly, be able to
explore, discover and operate within a huge and ever-changing
world of information.

By making it easier to both locate and distribute software,
the rise of the Internet has also greatly increased the
number of readily accessible software packages. These include
not only end-user applications, "plugins" and extensions for
system tools and programming languages, but also library
components for more-or-less standard tasks, such as parsing
standard document formats and mark-ups.
In particular, there are efficient implementations of
data-intensive algorithms such as neural nets, statistical
analysis or "data mining" tools, and information retrieval and
other techniques for processing free text based on word
occurrence statistics. Such packages are potentially valuable
new sources of information. It will therefore be increasingly
important that software tools are able to work with
them, either by incorporating libraries or by interfacing to
stand-alone packages.

2. REQUIREMENTS 

To best exploit the opportunities presented by the rise of
the Internet, a KBS programming tool should offer:

Expressive language: The language should combine
object-oriented programming and symbolic reasoning. It
should also provide efficient memory management, and
support both incremental program development and
interactive debugging.

Searching and pattern matching: Only a very small
proportion of the oceans of material available can be
readily assimilated by a program. Most is free text, and
much of the rest (such as the results of database queries)
is tables and other more-or-less regular "arrays". Handling
such material requires searching for words and
phrases, recognising repeating structures and extracting
content from them.

Support for Server Building: The Internet makes it
natural to make a system available as a "server" that
users, or other programs (i.e. agents), can contact
as required. However, not all KBSs are well suited to
starting on demand (e.g. for CGI), and many can
require long periods of processing. An ideal KBS tool
should support building processes that remain
responsive while reasoning.

Easy Interaction with Other Software: As well as
providing services to other programs, future KBSs must be
able to use the wide range of libraries or packages that
are available. Tools must allow linking to C (the most
common language for package distribution), and
facilitate interacting with other stand-alone processes.

One way to achieve this is by building libraries for a lan- 
guage which has some of the desired properties. This is 
the approach taken by Jess, the Java Expert System Shell 
(see http://herzbergl.ca.sandia.gov/jess/), which implements
a rule-based engine within Java.
The alternative is to combine two (or more) languages which 
each exhibit some of the desired properties into a single pro- 
gramming environment, in the manner of Poplog [1]. This 
is the approach adopted by CAPE. 

3. CAPE 

CAPE (CLIPS And Perl with Extensions) is a programming
environment which allows programs to be written in an
intimate mixture of the CLIPS [2] (CLIPS: C Language
Integrated Production System) rule-based and object-oriented
language, and Perl [3], a procedural programming language.
CLIPS was chosen because it closely integrates a very fast
forward-chaining rule-based system with a flexible object
system that supports both message passing and generic
functions. CLIPS was initially a partial re-implementation, in C,
of Inference ART [6], which was arguably the most powerful
of the Lisp-based "knowledge representation toolkits" that
emerged during the mid/late 1980s. Its rule language
features very powerful and efficient pattern matching facilities
based on the RETE algorithm [7], including the ability
to match against the state of objects, and a truth
maintenance mechanism. There is tutorial material for CLIPS in
[4], and CLIPS itself is accessible via

http://www.ghgcorp.com/clips/
along with detailed manuals and pointers to related software
and information.

Perl was chosen because of its extremely powerful regular
expression matching facilities, and its huge library of freely
available software modules. It also supports complex data
structures, and (combinations of symbolic patterns). There
are many books about Perl programming (e.g. [5]). Perl
itself is accessible via http://www.perl.com/, along with
manuals and pointers to a huge quantity of related software
and information.

To the two separate language sub-systems, CAPE adds
inter-calling and data transfer. It also provides mechanisms for
synchronising the initialisation of data structures between
the two programming systems. Finally, it uses CLIPS's run
function facility to monitor Internet sockets during
reasoning, so that the system remains responsive to socket
activity.

3.1 CAPE Program Structure 

When CAPE reads a program, it starts by breaking the stream
of characters read into "chunks" of either CLIPS or Perl code,
or CAPE's own commands. Because CLIPS's syntax is
extremely simple, valid CLIPS code can be recognised from the
first non-blank character. Since CAPE's own commands are
required to be prefixed by an "\", anything else is treated as
Perl code. The nature of the chunk being read is then used
to determine how it will end. Balancing parentheses makes
this easy for CLIPS code. Perl chunks are terminated by
heuristics based on commonly followed layout conventions,
but CAPE also provides mechanisms to forcibly terminate a
chunk if these heuristics are inappropriate.
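
The classification step described above can be sketched as
follows. This is a Python illustration only, not CAPE's
own code: the function names are hypothetical, the sketch
assumes that a CLIPS chunk always opens with "(" and that the
CAPE command prefix is a backslash (as the text reads), and it
omits the layout heuristics used to terminate Perl chunks.

```python
def classify_chunk(text, command_prefix="\\"):
    """Classify a chunk by its first non-blank character.

    Assumption: CLIPS code starts with '(' and CAPE commands start
    with `command_prefix`; everything else is treated as Perl.
    """
    stripped = text.lstrip()
    if not stripped:
        return "empty"
    first = stripped[0]
    if first == "(":
        return "clips"
    if first == command_prefix:
        return "cape-command"
    return "perl"


def clips_chunk_end(text, start=0):
    """Return the index just past the ')' that balances the first '('.

    Parentheses inside double-quoted strings are ignored. Returns
    None if the parentheses are not yet balanced (chunk incomplete).
    """
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    return None
```

A reader built this way would keep accumulating input for a
CLIPS chunk until clips_chunk_end stops returning None.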
Once a chunk has been read, it is pre-processed. CAPE
allows textual substitutions to be defined using Perl regular
expressions, and arbitrary user-defined functions to be
called. CAPE itself as yet makes only minimal use of this
mechanism (to allow Perl scalar variables to be accessed
from within CLIPS code), but the mechanism is provided
primarily as a hook to support future language extensions.
Finally, the pre-processed code is passed to one or other of
the two language interpreters for execution.
CAPE also offers the user a read/eval/print loop, which
prompts for a command, reads and pre-processes a chunk as
described above, sends it for evaluation by either the Perl
or CLIPS interpreter, and ensures that the result is printed.
The listener itself is written in Perl, so it can be redefined,
and thus customised or extended, by the user.

3.2 Inter-Language Interaction 

The most general bridge between CLIPS and Perl is through 
evaluation. Both languages provide an eval function which 
takes a string argument that it parses and executes. Both 
these functions are made available to the "other" language, 
along with a function which interprets its argument as a 
CAPE "chunk" which is classified as described above. Strings
to be evaluated through this mechanism are pre-processed
in the same way as code that CAPE has read.
In addition to this completely general evaluation
mechanism, CAPE also provides functions for directly calling named
functions, and for sending messages. When these functions
are called, C code within CAPE maps their arguments
directly from the C data structures underlying the calling
language into the corresponding C data structures for the
target language. Any result returned by the called function is
similarly transformed into the form required by the calling
language. Since both CLIPS and Perl employ dynamic
memory allocation and garbage collection, the code for mapping
arguments between the systems must itself allocate (and
de-allocate) memory in order to avoid problems caused by the
interaction of the two memory management systems.
CAPE itself only supports the mapping of simple data types
(strings and numbers, and lists of these) between the
component languages. This is felt to be a good compromise
between power and simplicity of implementation (and thus
comprehensibility and reliability). It would be possible to
provide more complex mappings (e.g. transforming a
complete CLIPS object into a Perl hash or object). However,
there is as yet no compelling case for providing such
mappings in isolation, as opposed to as part of a comprehensive
integration of the languages' object systems, which, while
desirable (see "Future work" below), clearly involves
considerable design, programming and testing effort.
The facilities just described allow CLIPS programs to call
specific Perl subroutines, evaluate strings or match Perl
regular expressions, on either the left- or right-hand side of a
rule (that is, in either the condition or the action part) or
from within the body of a CLIPS function or message
handler. In particular, rule firing can be made dependent on
successful matching of Perl regular expressions.
In the other direction, Perl programs can call functions
defined in CLIPS (both normal functions and generic functions)
and can send messages to CLIPS objects. There are also Perl
subroutines defined for asserting facts into CLIPS working
memory, and for having CLIPS evaluate an arbitrary string.

3.3 Sockets

CAPE provides functions for initiating and configuring
Internet socket handling, and for monitoring sockets while the
CLIPS rule engine is running. These functions are available
from both CLIPS and Perl. The current state of port
monitoring and socket connection is used to continually update
a collection of CLIPS objects which are accessible from both
languages and available for pattern matching in CLIPS rules.
This means that systems providing services via an Internet
port can be built using only CAPE itself.

The lowest level handling of socket activity is done in C
within the core of CAPE. CLIPS is able to call an arbitrary
function after each rule firing, and CAPE uses this to check
for and accept connections to any ports being monitored,
and to aggregate any data received from any current
connection. Then, whenever the buffer associated with a
connection is full, or its record break character (currently
newline) is received, the accumulated byte string is passed to a
connection-specific Perl function for filtering. This function
could simply respond directly to the input received.
Typically, though, it will map it into CLIPS working memory,
thereby allowing the full power of the pattern matching to
be used to decide when and how to respond.
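
The buffering policy just described (hand a record to the
filter when the buffer fills or the record break character
arrives) can be sketched as below. This is a Python
illustration under stated assumptions, not CAPE's C
implementation: the class name, the default buffer size and
the handler signature are all hypothetical.

```python
class ConnectionBuffer:
    """Accumulate bytes for one connection and emit complete records."""

    def __init__(self, handler, max_size=4096, record_break=b"\n"):
        self.handler = handler            # per-connection filter (assumed callable)
        self.max_size = max_size          # assumed buffer capacity in bytes
        self.record_break = record_break  # record break character (newline)
        self.pending = bytearray()

    def feed(self, data):
        """Add newly received bytes and pass on every completed record."""
        self.pending.extend(data)
        while True:
            idx = self.pending.find(self.record_break)
            if idx != -1 and idx < self.max_size:
                # A record break arrived before the buffer filled.
                cut = idx + len(self.record_break)
            elif len(self.pending) >= self.max_size:
                # Buffer full: flush what has accumulated so far.
                cut = self.max_size
            else:
                break
            record = bytes(self.pending[:cut])
            del self.pending[:cut]
            self.handler(record)
```

In a design like the one described, feed would be invoked
from the function that runs after each rule firing, and the
handler would typically assert a fact describing the input
rather than reply directly.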

4. PROGRAM STRUCTURE 

The CLIPS rule engine has a very powerful mechanism for
deciding which rule to fire at any point. This can be used to
make extremely subtle and flexible decisions about the flow
of control within the system, i.e. about what the CAPE
application should do next. Doing this requires structuring
the system as a whole as a rule-based system, so that
firing rules triggers activities that are possibly implemented
in Perl. A system implemented in this way can remain
responsive to external events, provided that the code executed
by any particular rule does not take too long to run. Perl
is used to help map "the world" into symbols that CLIPS
can then reason about, and then to interpret the symbolic
results of that reasoning into actions in the world.
CAPE has been designed (and primarily tested and
exercised) with this model of system operation in mind.
However, there is no (known!) obstacle to building an "inverted"
system, i.e. one in which overall control resides in a Perl
program (or an interface generating call-backs into Perl),
within which some subroutines specify or initiate rule-based
reasoning.

5. USING CAPE

In addition to the demonstration and component
applications distributed with the system (see below), CAPE is being
used to develop DIME: the Distributed Information
Manipulation Environment. DIME is a collection of components
useful for building systems to retrieve and manipulate
distributed data. These components include a highly flexible
document cache (which is able to fetch documents when
necessary, store them locally, and search them in various ways)
and a framework for specifying and controlling interaction
with remote search systems.

DIME is being used to develop two research systems. The
first is K-DIME, or Kansei DIME [8], which is concerned with
fetching and filtering images based on subjective criteria,
or kansei. The second system being built using DIME is
Maxwell, a "smart" agent intended to both query and
combine the results from a number of book vendors' databases on
behalf of the user. An initial version of Maxwell was
implemented in Emacs Lisp (see [9] and [10]), and a more powerful
version is currently being implemented in CAPE using DIME.
The remainder of this section will illustrate the way various
features of CAPE are used and interact by describing a
system for generating graphical maps of web sites based on the
structure of the site and the nature of the pages it contains.
Given an initial URL and a regular expression to delimit the
area to be mapped, the system fetches the relevant pages
and analyses them to extract the links that they contain.
The eventual aim is to generate a graph and output it in
the format required by a graph layout system. The example
system is geared towards the layout system found in the
"thinking support tool" D-Abductor [11].
Although one can trivially generate a graph description from 
a web site, such approaches do not scale well; even modest 
web sites give rise to graphs that contain too many nodes to 
be drawn effectively. Producing workable graphs, therefore, 
involves either generating or recognising structures that can 
be used as a basis for omitting or temporarily hiding parts 
of the graph.

Different graph layout systems will hit size limits at
different points. Indeed, the layout system used in this system
is not particularly good in this respect since, because of
the way it arranges nodes to reflect graph structure,
certain site topologies cause it to make very inefficient use of
screen space. However, that simply means that the
limitations of drawing complete graphs are apparent sooner,
so the scope for intelligently transforming the site
description to overcome them can be explored while working with
smaller web sites.

A system for mapping a remote web site must fetch the pages
that are relevant to the map, identify their inter-connections,
and their links to the outside world, and then decide how
they can be clustered, and which links, pages, servers and
clusters should be shown in the particular graph.
Our programming language should ideally:

• allow decisions to be expressed clearly in "domain"
terms, such as web pages, servers, links, graph nodes
and so forth

• use standard libraries for actually fetching documents 

• provide good pattern matching for analysing URLs 
and extracting information from document contents 

CAPE meets these criteria well.
The mapping system uses Perl for:

Accessing libraries: Pages are fetched through the docu- 
ment cache component application, which in turn uses 
the HTTP package from CPAN. 

Manipulating URLs: Extracting and canonicalising
protocol, host names, file extension etc.

Extracting links: Using regular expressions to search the
pages retrieved for links, image maps and so
forth.

Gathering statistics: Using regular expressions to anal- 
yse the documents handled, finding and counting links, 
words in anchor texts and so forth. 
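
The link extraction and statistics gathering listed above can
be sketched as follows. The actual system does this with Perl
regular expressions; the Python below is only an illustration,
and the pattern and function names are assumptions (the
pattern handles only quoted href attributes and is not a full
HTML parser).

```python
import re
from collections import Counter

# Assumed pattern: an <a> element with a quoted href, capturing
# the target URL and the (possibly marked-up) anchor content.
LINK_RE = re.compile(
    r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\'][^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
TAG_RE = re.compile(r"<[^>]+>")


def extract_links(html):
    """Return (url, anchor_text) pairs for each link found in html."""
    links = []
    for href, inner in LINK_RE.findall(html):
        # Strip nested tags and normalise whitespace in the anchor text.
        anchor_text = " ".join(TAG_RE.sub(" ", inner).split())
        links.append((href, anchor_text))
    return links


def anchor_word_counts(links):
    """Count how often each word occurs across all anchor texts."""
    counts = Counter()
    for _href, text in links:
        counts.update(word.lower() for word in text.split())
    return counts
```

Each pair returned by extract_links corresponds to the kind of
information that could be asserted as a link fact for the
rules to match against.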

However, the top-level of the system is written in CLIPS. In
particular, the system's ontology is described by defining
CLIPS objects for such concepts as pages, page sets, servers
and directories. CLIPS rules and methods are then used for
mapping pages into suitable graph descriptions. The current
prototype contains just over 40 rules, doing such things as:

• recognising and collapsing multiple links between the 
same nodes, 

• recognising and grouping (and potentially collapsing) 
sets of similarly-linked pages 

• selecting the necessary number of strategies for omit- 
ting details. These strategies include things like omit- 
ting servers with only one document on them, and col- 
lapsing servers with several documents 

• rendering information as properties of graphical enti- 
ties (e.g. colours, thicknesses etc.). 

This approach is taken to ensure that, as far as possible, 
rules can be written at the domain level, without reference 
to the mechanics of fetching or searching documents. The 
figures give some idea of the extent to which this has been 
achieved. 

Figure 1 shows the rule that is responsible for deciding to
fetch a page. This is a perfectly ordinary CLIPS rule, matching



(defrule decide-to-fetch-page
  "If we know of a page relevant to our target,
   note that we should fetch it"
  (declare (salience -60))
  (target ?query ?function)
  (object (is-a page)
          (name ?doc)
          (type html | unknown)
          (did ?*not-fetched*)
          (url ?url&:(perl-test ?function ?url)))
  =>
  (assert (to-fetch ?query ?url ?doc)))

Figure 1: A typical rule, written in domain terms

against a fact and an object, and using a user-defined
function to test the value of one of the object's slots (url). The
only unusual feature is the fact that the actual test on the
URL is carried out in Perl. The simplest way to have done
this would have been to directly match the URL against a
Perl regular expression defining the pages of interest by
using the CAPE function p-match-p, thus:

    (url ?url&:(p-match-p ?url ?regexp))
In fact, though, the rule used is somewhat more sophisti- 
cated in its use of Perl facilities. At the time when the target 
for the query is defined, the regular expression is passed to a 
Perl function which defines a second Perl function to check a 
URL against the regular expression, and returns the name of 
that function. Because this function is only used to match 
against a single (constant) regular expression, Perl can be 
told to optimise the matching process. The specially-defined 
function is then called using the CAPE function perl-test, 
which calls the Perl function named by its first argument, and
returns a CLIPS boolean. Other arguments to perl-test are
converted to Perl primitives (in this case, the URL is con- 
verted to a Perl string) and passed in to the Perl function. 
Figure 2 illustrates a second rule, which recognises a group
of documents that are "siblings", i.e. that contain links
to the same URL which have the same anchor text. The
rule matches over facts describing the links that have been
found in the various documents, finding three that refer
to different document identifiers (?doc-id, ?doc-id2 and
?doc-id3) but are otherwise identical. Facts like these can
be generated in CAPE in a single line of code, just by
requesting the matches for an appropriate regular expression.
They could, of course, be generated in CLIPS, but it would
require considerably more error-prone and tedious
programming.

Note also the use of a second Perl function (same.doc) which 
compares two URLs to determine whether they refer to the 
same document. This involves more than just checking 
whether the URLs are identical, since it is also necessary to 
ignore any anchors within the document, and to disregard 
any trailing slash. 
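
The comparison just described can be sketched as follows.
This is a Python illustration of the stated behaviour only
(ignore any anchor within the document, disregard a trailing
slash); the actual function is written in Perl, and the
helper name canonical is an assumption.

```python
from urllib.parse import urldefrag


def canonical(url):
    """Drop any '#anchor' fragment and any trailing slash."""
    without_fragment, _fragment = urldefrag(url)
    return without_fragment.rstrip("/")


def same_doc(url_a, url_b):
    """Return True if both URLs refer to the same document."""
    return canonical(url_a) == canonical(url_b)
```

A fuller treatment would probably also normalise host name
case and default ports, which the text does not mention and
which this sketch leaves out.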

6. STATUS AND FUTURE WORK 

CAPE is a new tool which brings together two very different
programming systems. It runs under a number of versions of
Unix, including Solaris and Linux. The core functionality of
the current version of the system has been stable since the
spring of 1998, and, as described above, the system itself
has been used for the development of a number of research
systems since the following summer.
(defrule spot-siblings
  "Find 'identical' links to the same place from
   different sources"
  (declare (salience 50))
  (link ?doc-id ?source ?anchor ?dest $?rest)
  (not
    (object (is-a sibling-set)
            (anchor ?anchor)
            (destination ?dest)))
  (link ?doc-id2&~?doc-id
        ~?source ?anchor ?dest $?rest)
  (link ?doc-id3&~?doc-id&~?doc-id2
        ~?source ?anchor ?dest $?rest)
  (object (is-a webitem)
          (name ?obj)
          (url ?u&:(perl-test same.doc ?dest ?u)))
  =>
  (bind ?ss (gensym "family-"))
  (make-instance ?ss of sibling-set
    (intarget TRUE)
    (status possible)
    (anchor ?anchor)
    (destination-obj ?obj)
    (destination ?dest)))

Figure 2: A second rule, again almost entirely in domain
terms

The system was made available by ftp and announced on
the Web in February 1999, and has been downloaded
over one hundred and fifty times since then.
The core CAPE environment comprises 3000 lines of C code,
plus 600 lines of Perl (and a small amount of CLIPS). The
system comes with a thirty-one page manual ([12]) and 1500
lines of CAPE code in two standard CAPE "component
applications", or "Capplets" (support for regression testing CAPE
applications and a simple Web server).
The distribution currently also includes three demonstration 
applications: 

Handshaking: A minimal demonstration of communication
between a pair of CAPE processes. An "interface
agent" accepts user requests (via HTTP) for commands
to be executed, and forwards them to the relevant
"execution agent", which executes them in due
course. The execution agent then contacts the interface
agent with the results, which it then forwards to
the user.

Web server: This application operates as a (partial) web
server. It accepts a limited set of HTTP requests,
locates the relevant file and then replies with an
appropriate HTTP response: either an error, or the
contents of the requested file (along with appropriate
HTTP header information). However, the server
also allows users to specify (using an HTML form
that it generates) required transformations to pages,
which are then applied to the pages being
served. This demonstration system, which uses the
"webserve" component application, is about 350 lines
of code.

"Dungeon": This application provides a real-time
multi-user "dungeon" game. Players access the system via
a web browser, and are able to direct the actions of
one of a number of "characters" moving within a
simple environment. In addition to moving about and
manipulating objects, players are likely to encounter
other characters, either controlled by other players, or
by the computer. All descriptions of situations are
generated from information structures (i.e. there is
no "canned" text) based on the objects and locations
known to the user. Computer-controlled characters
can generate and follow multi-step plans.

The entire application was built from scratch in five
days. It contains a total of 2800 lines of CAPE code
(about one third Perl), although re-implementation
using the webserve component application would reduce
this substantially.

One of the main difficulties with CAPE is the fact that
CLIPS and Perl have different syntaxes, CLIPS being Lisp-like
and Perl being generally (arguably "vaguely") C-like. This
unfortunately means that fully exploiting CAPE requires
reasonable proficiency in both languages, and in working with
more than one language at a time. The possibility of
producing a single "unified" syntax has considerable appeal.
However, defining such a thing is not easy, since Perl
syntax is already complex and very "dense": that is, a high
proportion of characters and character combinations are
already assigned meanings. Moreover, the appeal of a single
syntax must be seen alongside the advantages of continuing
to support unchanged the syntax of each of the
underlying languages: access to the substantial bodies of software,
documentation and expertise for these languages.
There are a number of obvious extensions to CAPE, some of 
which will be added to the system in the near future: 

• Use the code pre-processing facilities to extend CLIPS to
support backward chaining, or goal-oriented reasoning,
by using particular fact patterns to represent the
presence of particular goals, which will then drive the
forward-chaining rule-engine in CLIPS.

• Use the code pre-processing facilities to extend CLIPS to
support explicit declaration of relationship properties
(e.g. transitivity, reflexivity, symmetry etc.) and the
automatic definition of rules and methods to enforce
and infer the consequences of these.

• CLIPS and Perl both provide powerful object systems
(albeit with quite different properties) and support for
modules (multiple namespaces). They should be
integrated.

• CLIPS has powerful mechanisms for directing I/O by
defining (in C) arbitrary routers for various logical
information types. Perl has sophisticated mechanisms
for formatting and paginating output. Yet CAPE's own
mechanisms for handling I/O through sockets are not
yet related to either!

• There are several additions to CAPE which would help 
implementing autonomous agents: specifically, mecha- 
nisms to allow the system to do things at a particular 
time, to facilitate the controlled mapping of specific 
kinds of information between CLIPS working memory 
and (ideally shared) persistent store and to support 
the KQML agent-communication standard. 

• Recent versions of Perl can be compiled to support
multiple threads. In contrast, CLIPS is only
single-threaded, and although CAPE can be linked with a
multi-threaded Perl, it does nothing to allow multiple
threads to be actually used.

7. CONCLUSION

CAPE is a powerful tool for building a new generation of
KBSs. It combines the strengths of two well-established tools
with very powerful but complementary pattern-matching
mechanisms. Perl's ability to search text and match
powerful regular expressions is unequaled, while CLIPS provides
powerful mechanisms for finding patterns of combinations
of symbolic information. The CAPE programmer can exploit
the strengths of both, using Perl to analyse documents or
query results, and CLIPS to recognise and react to the
combinations of matches found.

CAPE provides powerful mechanisms to support a number
of key activities:

Symbolic reasoning: CLIPS offers a very efficient
forward-chaining rule-based system with extremely expressive
pattern matching, coupled with a highly flexible
object-oriented system and supported by a truth-maintenance
system.

Data analysis/manipulation: Perl has extremely powerful
regular expression matching coupled with very concise
string handling and easy-to-use hash-based
index-building and data structuring.

Service Provision: CAPE's socket monitoring mechanisms
allow a rule-based program to remain responsive to
external activity even while it is reasoning.

Standard languages/libraries: CAPE programs can use
any of the enormous range of software available in
CPAN, the Comprehensive Perl Archive Network
(accessible via http://www.perl.com/).

Interaction with software packages: Perl provides very
concise and flexible mechanisms for controlling and
processing the results obtained from system commands
and other external programs. CAPE programmers can
also exploit the tools for generating Perl "wrappers"
for software components written in C, and make use
of Perl's ability to dynamically load compiled code at
run-time.

Together, these features make CAPE a powerful tool for
building KBSs that can exploit the opportunities offered by the
Internet.

CAPE is freely available, with full source, from 

http://www.hcrc.ed.ac.uk/~robert/CAPE